Here is my "simplified" software diagram for the ATAR DAQ:
graph TD;
NaluFrontend["<b><a href='https://github.com/PIONEER-Experiment/atar_daq'>Nalu MIDAS Frontend</a></b><br>Coordinates MIDAS event construction"]
subgraph Libraries
NaluBoardLib["<b><a href='https://github.com/jaca230/nalu_board_controller'>Nalu Board Controller</a></b><br>C++ Wrapper around naludaq methods for configuring the board and starting readout"]
NaluEventCollectorLib["<b><a href='https://github.com/jaca230/nalu_event_collector'>Nalu Event Collector</a></b><br>C++ API for launching collector threads. Handles receiving data over UDP, processing packets, and collecting into NaluEvents"]
MidasLib["<b><a href='https://bitbucket.org/tmidas/midas/src/develop/'>MIDAS</a></b><br>Data acquisition framework"]
ReflectCppLib["<b><a href='https://github.com/getml/reflect-cpp'>reflect-cpp</a></b><br>C++ reflection library used for serialization"]
end
subgraph Python Packages
NaludaqPython["<b><a href='https://pypi.org/project/naludaq/0.31.9/'>naludaq</a></b><br>Python interface for Naludaq"]
end
subgraph Classes
NaluBoardController["<b><a href='https://github.com/jaca230/nalu_board_controller/blob/main/include/nalu_board_controller.h'>nalu_board_controller</a></b><br>Provides methods for configuring the board and starting readout"]
NaluEventCollector["<b><a href='https://github.com/jaca230/nalu_event_collector/blob/main/include/nalu_event_collector.h'>nalu_event_collector</a></b><br>Provides methods for starting collector threads and polling for events"]
OdbManager["<b><a href='https://github.com/PIONEER-Experiment/atar_daq/blob/main/include/odb_manager.h'>odb_manager</a></b><br>Handles initializing and managing ODB structure for Nalu Equipment"]
MidasFrontend["<b><a href='https://bitbucket.org/tmidas/midas/src/develop/include/mfe.h'>mfe</a></b><br>Handles MIDAS frontend logic"]
end
%% Connect libraries to the Python Packages layer
NaluBoardLib -->|Pybind| NaludaqPython
%% Connect the frontend to libraries and libraries to the Classes layer
NaluFrontend -->|Uses| NaluBoardLib
NaluFrontend -->|Uses| NaluEventCollectorLib
NaluFrontend -->|Uses| MidasLib
NaluFrontend -->|Uses| ReflectCppLib
NaluBoardLib -->|Provides| NaluBoardController
NaluEventCollectorLib -->|Provides| NaluEventCollector
MidasLib -->|Provides| MidasFrontend
ReflectCppLib -->|Used By| OdbManager
I identified "problematic" rate-test parameters with this script:
# Step 1: First filter based on Expected Data Rate
df_filtered_initial = df[df['Expected Data Rate (KB/s)'] < 55000].copy()
# Step 2: Define filtering conditions on this subset
condition_1 = df_filtered_initial['Collector Error'].notna() & (df_filtered_initial['Collector Error'] != "None")
condition_2 = ~df_filtered_initial['kBytes per sec'].div(df_filtered_initial['Expected Data Rate (KB/s)']).between(0.8, 1.4)
condition_3 = ~df_filtered_initial['Frequency (Hz)'].div(df_filtered_initial['Data Rate (Events per sec)']).between(0.9, 1.1)
condition_4 = df_filtered_initial['Frequency (Hz)'] > 1000 # Frequency must be above 1 kHz
# Step 3: Create a reason column to track which conditions were met
df_filtered_initial['Reason'] = ''
df_filtered_initial.loc[condition_1, 'Reason'] += 'Collector Error; '
df_filtered_initial.loc[condition_2, 'Reason'] += 'Data Rate Mismatch; '
df_filtered_initial.loc[condition_3, 'Reason'] += 'Frequency/Data Rate Mismatch; '
# Step 4: Apply the additional filtering conditions
filtered_df = df_filtered_initial[(condition_1 | condition_2 | condition_3) & condition_4].copy()
# Display row count and first few rows for verification
print(f"Filtered DataFrame has {filtered_df.shape[0]} rows.")
filtered_df[['File', 'Frequency (Hz)', 'Data Rate (Events per sec)',
'Windows', 'Events Sent', 'kBytes per sec',
'Active Channels Length', 'Expected Data Rate (KB/s)',
'Collector Error', 'Reason']]
# Define the output file path
output_file = "filtered_data.txt"
# Open the file and write each row in the specified format
with open(output_file, "w") as f:
    for _, row in filtered_df.iterrows():
        frequency = int(row['Frequency (Hz)'])
        windows = int(row['Windows'])
        channels = int(row['Active Channels Length'])
        computed_value = frequency * windows * channels
        # Format the line as: 0 0 0 {frequency} {windows} {channels} {computed_value}
        f.write(f"0 0 0 {frequency} {windows} {channels} {computed_value}\n")
print(f"Filtered data has been written to {output_file}")
So my criteria are (a parameter set is flagged if it passes both baseline cuts and fails at least one of the three checks; see the sketch after this list):
The expected data rate must be below 55 MB/s (baseline cut).
The trigger rate must be above 1 kHz (baseline cut).
The "normalized" data rate is outside the range [0.8, 1.4].
The "normalized" event rate is outside the range [0.9, 1.1].
The run contained a collector error.
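For reference, here is the same selection logic collapsed into a single row-level predicate. This is only a sketch mirroring the pandas conditions above; the thresholds and column names are taken directly from the script.

import pandas as pd

def is_problematic(row):
    """Row-level version of the flagging logic above (same thresholds and column names)."""
    # Baseline cuts: expected data rate below 55 MB/s and trigger rate above 1 kHz
    if not (row['Expected Data Rate (KB/s)'] < 55000 and row['Frequency (Hz)'] > 1000):
        return False
    has_error = pd.notna(row['Collector Error']) and row['Collector Error'] != "None"
    rate_mismatch = not (0.8 <= row['kBytes per sec'] / row['Expected Data Rate (KB/s)'] <= 1.4)
    event_mismatch = not (0.9 <= row['Frequency (Hz)'] / row['Data Rate (Events per sec)'] <= 1.1)
    return has_error or rate_mismatch or event_mismatch

# Roughly equivalent to the filtering above (minus the Reason column):
# problematic = df[df.apply(is_problematic, axis=1)]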
Using the above criteria, I created a parameter space for the sequencer to go through. For each point in the parameter space I did a 1-minute-long run and took a sample of MIDAS's measured data rate and event rate every 4 seconds, so each "problematic" parameter set has 15 sequential samples.
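For context, a minimal sketch of how those per-run samples would be loaded and grouped, assuming they were exported to a single CSV (the filename rate_test_samples.csv is hypothetical; the column names match those used below):

import pandas as pd

# Hypothetical export of the 4-second samples; one row per sample, ~15 rows per run
df = pd.read_csv("rate_test_samples.csv")

# Each run number should contribute about 15 sequential samples (60 s / 4 s)
print(df.groupby('Run Number').size().describe())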
After collecting this data, I averaged it and computed means and uncertainties like so:
import numpy as np

# Function to compute Time to Stability
def time_to_stability(data_rates, tolerance=0.01, window=3):
    """Returns the index where the data rate stabilizes within tolerance of the final value for a given run."""
    final_value = data_rates.iloc[-1]  # Assume the last value is steady-state
    threshold = final_value * (1 - tolerance)  # Define stability threshold
    for i in range(len(data_rates) - window):
        if np.all(data_rates.iloc[i:i+window] >= threshold):
            return i  # First index where stability is reached
    return len(data_rates)  # If it never stabilizes, return the full length

def count_collector_errors(errors):
    """Counts the number of non-'None' and non-'N/A' collector errors."""
    return errors[~errors.isin(['None', 'N/A'])].count()
# Define the aggregation functions for each column
agg_funcs = {
    'Frequency (Hz)': 'mean',                      # No uncertainty needed
    'Data Rate (Events per sec)': [
        'mean',                                    # Mean
        lambda x: np.std(x) / np.sqrt(len(x)),     # Uncertainty (standard error)
        time_to_stability                          # Compute stability time
    ],
    'Windows': 'mean',                             # No uncertainty needed
    'Events Sent': 'max',                          # Take the maximum value
    'kBytes per sec': [
        'mean',                                    # Mean
        lambda x: np.std(x) / np.sqrt(len(x))      # Uncertainty (standard error)
    ],
    'Active Channels Length': 'mean',              # No uncertainty needed
    'Expected Data Rate (KB/s)': 'mean',           # No uncertainty needed
    'Collector Error': count_collector_errors      # Count occurrences of actual errors
}
# Perform the groupby aggregation
consolidated_df = df.groupby('Run Number').agg(agg_funcs)
# Rename the columns for clarity
consolidated_df.columns = [
    'Avg Frequency (Hz)',
    'Avg Data Rate (Events per sec)', 'Uncertainty Data Rate', 'Time to Stability',
    'Avg Windows',
    'Max Events Sent',
    'Avg kBytes per sec', 'Uncertainty kBytes per sec',
    'Avg Active Channels Length',
    'Avg Expected Data Rate (KB/s)',
    'Collector Error Count'
]
# Compute the normalized values and their uncertainties
consolidated_df['Normalized Frequency'] = consolidated_df['Avg Data Rate (Events per sec)'] / consolidated_df['Avg Frequency (Hz)']
consolidated_df['Uncertainty Normalized Frequency'] = consolidated_df['Uncertainty Data Rate'] / consolidated_df['Avg Frequency (Hz)']
consolidated_df['Normalized kBytes per sec to Expected Data Rate'] = consolidated_df['Avg kBytes per sec'] / consolidated_df['Avg Expected Data Rate (KB/s)']
consolidated_df['Uncertainty Normalized kBytes per sec to Expected Data Rate'] = consolidated_df['Uncertainty kBytes per sec'] / consolidated_df['Avg Expected Data Rate (KB/s)']
# Reset index for better readability
consolidated_df.reset_index(inplace=True)
# Display the consolidated DataFrame
consolidated_df
I also computed two "new" metrics: the time it takes the data rate to stabilize ("Time to Stability") and the number of collector errors per run ("Collector Error Count").
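To make the "Time to Stability" definition concrete, here is a toy example using the function above (the sample values are made up):

import pandas as pd

# Made-up data rates sampled every 4 s; the rate ramps up and then levels off
rates = pd.Series([10.0, 30.0, 48.0, 49.5, 49.8, 50.0, 50.0])

# With tolerance=0.01 and window=3, we look for the first index where three
# consecutive samples are all >= 99% of the final value (50.0, so threshold 49.5)
print(time_to_stability(rates))  # -> 3, i.e. stable from the 4th sample onward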
Here are the plots from the last round of analysis, just for comparison purposes. They don't give much insight into what's causing the lower data rates:
Note: the reason some of the data points fall outside the 55 MB/s expected-data-rate range, despite the cut I made earlier to exclude them, is as follows:
When I made the cut, I used this data rate calculation:
\text{Data Rate (B/s)} \approx \text{Trigger rate}\cdot(\text{N}_\text{channels}\cdot\text{N}_\text{windows}\cdot(\text{Packet Length = 80 bytes}) + (\text{Event Header+Footer = 28 bytes}))
However, this is a mistake; it should be:
\text{Data Rate (B/s)} \approx \text{Trigger rate}\cdot(\text{N}_\text{channels}\cdot\text{N}_\text{windows}\cdot(\text{Packet Length = 80 bytes}) + (\text{Event Header+Footer = 34 bytes}) + (\text{Timing Data = 64 bytes}))
So our cut was off by a little bit.
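For a sense of scale (the parameter values here are just an illustrative example, not one of the actual test points): with a 10 kHz trigger, 8 channels, and 4 windows the two formulas give
\text{old: } 10^4 \cdot (8\cdot 4\cdot 80 + 28) = 25.88\ \text{MB/s}, \qquad \text{corrected: } 10^4 \cdot (8\cdot 4\cdot 80 + 34 + 64) = 26.58\ \text{MB/s}
so the missing 70 bytes per event shift the expected rate by a few percent, which is enough to push points sitting near the 55 MB/s boundary across it.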
Here are some "new" plots that are a bit more telling:
Collector error analysis:
We can see how collector errors depend on our parameters. For some reason, they were very prevalent in 2 channels, but the thing most correlated with collector errors is the expected data rate.
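One quick way to check that correlation numerically, as a sketch using the consolidated columns defined above:

# Correlation of the collector error count with the scanned parameters
cols = ['Avg Frequency (Hz)', 'Avg Windows', 'Avg Active Channels Length',
        'Avg Expected Data Rate (KB/s)', 'Collector Error Count']
print(consolidated_df[cols].corr()['Collector Error Count'].sort_values(ascending=False))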
Time to stability analysis
Using just \text{DataRate}[j] \geq 0.99\cdot \text{DataRate}[N]:
Using both \text{DataRate}[j] \geq 0.99\cdot \text{DataRate}[N] and \text{DataRate}[j] \leq 1.01\cdot \text{DataRate}[N]:
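The time_to_stability function above only implements the first (one-sided) criterion; a two-sided variant matching the second criterion would look roughly like this (a sketch, not the code actually used for the plot):

import numpy as np

def time_to_stability_two_sided(data_rates, tolerance=0.01, window=3):
    """First index where `window` consecutive samples are within +/- tolerance of the final value."""
    final_value = data_rates.iloc[-1]
    lower, upper = final_value * (1 - tolerance), final_value * (1 + tolerance)
    for i in range(len(data_rates) - window):
        chunk = data_rates.iloc[i:i + window]
        if np.all((chunk >= lower) & (chunk <= upper)):
            return i
    return len(data_rates)  # Never stabilized within the run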
We can see how the time for the data rate to stabilize depends on our parameters. Again, it's most correlated with the expected data rate.
Average data rate with uncertainty vs parameters
We can see how the data rate depends on our parameters. We see artifacts of event "skipping" in there.
Normalized average data rate with uncertainty vs parameters
Same plot as above, just normalized as \frac{\text{average data rate}}{\text{expected data rate}}. This gives insight into which parameters are problematic. However, this "expected data rate" calculation has issues because I don't know how to correctly account for all the data going into MIDAS; i.e., the logger logs some additional data (such as bank names, indices, etc.) that skews the actual result upwards. As a result, I believe a lot of these events are actually "normal"; they were just picked out by my cuts above because of this lack of understanding.
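One way to probe this would be to fold a per-event overhead term into the expected rate and see how much of the excess it explains. This is only a sketch; the 100-byte figure is a placeholder guess for the MIDAS event/bank header and bank-name overhead, not a measured value.

# Placeholder guess for per-event MIDAS overhead (event header, bank header, bank names, ...)
MIDAS_OVERHEAD_BYTES = 100

overhead_kb_per_sec = consolidated_df['Avg Frequency (Hz)'] * MIDAS_OVERHEAD_BYTES / 1000.0
corrected_expected = consolidated_df['Avg Expected Data Rate (KB/s)'] + overhead_kb_per_sec
consolidated_df['Normalized kBytes (overhead-corrected)'] = (
    consolidated_df['Avg kBytes per sec'] / corrected_expected
)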
Normalized average event rate with uncertainty vs parameters
This is similar to the plot above, except we're plotting the normalized event rate, i.e. \frac{\text{average event rate}}{\text{input trigger frequency}}. I believe this plot is slightly more telling than the one above because we really expect every one of these data points to be on the red dotted line; otherwise we're missing events. Overall, it's unclear what's causing events to be missed. It's most correlated with the expected data rate.
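A quick way to rank which parameter sets lose the most events, as a sketch using the consolidated columns above:

# Fraction of triggers that never showed up as events (0 means nothing is missing)
consolidated_df['Missed Event Fraction'] = 1 - consolidated_df['Normalized Frequency']
print(consolidated_df.sort_values('Missed Event Fraction', ascending=False)
      [['Run Number', 'Avg Frequency (Hz)', 'Avg Windows',
        'Avg Active Channels Length', 'Missed Event Fraction']].head(10))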
I did a longer run where I don't just look at the failure modes, to get more data on the "working modes". What I find is that there are actually more errors than expected; i.e., the 4-second tests did not reveal as many errors as the 60-second tests. I suspect this may have to do with the choice of the collector's "time_threshold" parameter: if it's not set properly, the collector can fill up. Below are some plots (very similar to the ones above).